AITopics | linear-width neural network

Spectral Evolution and Invariance in Linear-width Neural Networks

Neural Information Processing Systems, Dec-24-2025, 21:26:13 GMT

We investigate the spectral properties of linear-width feed-forward neural networks, where the sample size is asymptotically proportional to network width. Empirically, we show that the spectra of the weight matrices in this high-dimensional regime are invariant when trained by gradient descent with small constant learning rates; we provide a theoretical justification for this observation and prove the invariance of the bulk spectra for both conjugate and neural tangent kernels. We demonstrate similar characteristics when training with stochastic gradient descent with small learning rates. When the learning rate is large, we exhibit the emergence of an outlier whose corresponding eigenvector is aligned with the training data structure. We also show that after adaptive gradient training, where a lower test error and feature learning emerge, both weight and kernel matrices exhibit heavy-tailed behavior. Simple examples are provided to explain when heavy tails can lead to better generalization. We exhibit different spectral properties such as invariant bulk, spike, and heavy-tailed distribution from a two-layer neural network using different training strategies, and then correlate them to the feature learning. Analogous phenomena also appear when we train conventional neural networks with real-world data. We conclude that monitoring the evolution of the spectra during training is an essential step toward understanding the training dynamics and feature learning.
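The kind of spectral monitoring described in this abstract can be illustrated with a small NumPy sketch. It is not the authors' experiment: the sizes `n`, `d`, `m`, the tanh teacher `beta`, the fixed second-layer signs `a`, and the values of `lr` and `steps` are all assumptions chosen for illustration. The sketch trains only the first-layer weights of a two-layer network with full-batch gradient descent at a small constant learning rate and compares the singular values of the weight matrix before and after training.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 200, 100, 150                        # samples, input dim, hidden width (all comparable)

X = rng.standard_normal((n, d)) / np.sqrt(d)   # rows have norm close to 1
beta = rng.standard_normal(d)
y = np.tanh(X @ beta)                          # hypothetical teacher labels (assumption)

W = rng.standard_normal((m, d))                # trainable first-layer weights
a = rng.choice([-1.0, 1.0], size=m)            # fixed second-layer signs (assumption)

def forward(W):
    H = np.tanh(X @ W.T)                       # (n, m) hidden features
    return H, H @ a / np.sqrt(m)               # network output, shape (n,)

def loss(W):
    return 0.5 * np.mean((forward(W)[1] - y) ** 2)

def grad(W):
    H, f = forward(W)
    # dL/dW[j, :] = (1/n) * sum_i (f_i - y_i) * a_j / sqrt(m) * (1 - H_ij^2) * x_i
    D = (f - y)[:, None] * (1.0 - H ** 2) * a[None, :] / np.sqrt(m)
    return D.T @ X / n

s_before = np.linalg.svd(W, compute_uv=False)  # singular values at initialization
loss_before = loss(W)

lr, steps = 0.1, 50                            # small constant learning rate (assumption)
for _ in range(steps):
    W = W - lr * grad(W)

s_after = np.linalg.svd(W, compute_uv=False)   # singular values after training
loss_after = loss(W)

rel_change = np.linalg.norm(s_after - s_before) / np.linalg.norm(s_before)
print(f"loss {loss_before:.4f} -> {loss_after:.4f}, relative spectrum change {rel_change:.2e}")
```

In this toy setting the loss decreases while the singular-value spectrum barely moves, which is consistent with the small-learning-rate invariance the abstract reports. Reproducing the outlier or heavy-tailed regimes would require larger learning rates or adaptive optimizers, which this sketch does not attempt.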

linear-width neural network, name change, spectral evolution and invariance

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.82)

Spectra of the Conjugate Kernel and Neural Tangent Kernel for linear-width neural networks

Neural Information Processing Systems, Dec-24-2025, 01:47:00 GMT

We study the eigenvalue distributions of the Conjugate Kernel (CK) and Neural Tangent Kernel (NTK) associated with multi-layer feedforward neural networks. In an asymptotic regime where network width is increasing linearly in sample size, under random initialization of the weights, and for input samples satisfying a notion of approximate pairwise orthogonality, we show that the eigenvalue distributions of the CK and NTK converge to deterministic limits. The limit for the CK is described by iterating the Marcenko-Pastur map across the hidden layers. The limit for the NTK is equivalent to that of a linear combination of the CK matrices across layers, and may be described by recursive fixed-point equations that extend this Marcenko-Pastur map. We demonstrate the agreement of these asymptotic predictions with the observed spectra for both synthetic and CIFAR-10 training data, and we perform a small simulation to investigate the evolution of these spectra over training.
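As a sanity check on the simplest case of this statement, the following NumPy sketch computes the empirical CK of a single hidden layer and compares its eigenvalue range with the support of the Marcenko-Pastur law. The setup is an assumption made for illustration, not the paper's general setting: inputs are exactly orthonormal (`X` is the identity), the activation is linear, and the helper names `conjugate_kernel` and `mp_edges` are hypothetical. Under these assumptions the CK reduces to a Wishart matrix with ratio `n / m`, whose spectrum is known to concentrate on the Marcenko-Pastur support; a nonlinear activation or several layers would change the limit.

```python
import numpy as np

def conjugate_kernel(X, W, activation):
    """Empirical CK of one hidden layer: (1/m) * sigma(W X)^T sigma(W X)."""
    m = W.shape[0]
    H = activation(W @ X)            # (m, n) post-activations
    return H.T @ H / m               # (n, n) Gram matrix over samples

def mp_edges(gamma):
    """Support edges of the unit-variance Marcenko-Pastur law for ratio gamma <= 1."""
    return (1 - np.sqrt(gamma)) ** 2, (1 + np.sqrt(gamma)) ** 2

rng = np.random.default_rng(0)
n, m = 300, 600                      # samples and width, with n / m held at 0.5
X = np.eye(n)                        # exactly orthonormal inputs, d = n (assumption)
W = rng.standard_normal((m, n))      # random Gaussian initialization

# Linear activation: the CK is W^T W / m, a Wishart matrix with ratio n / m.
eigs = np.linalg.eigvalsh(conjugate_kernel(X, W, lambda z: z))
lo, hi = mp_edges(n / m)
print(f"eigenvalue range [{eigs.min():.3f}, {eigs.max():.3f}], MP support [{lo:.3f}, {hi:.3f}]")
```

Passing `np.tanh` instead of the identity function gives the CK for a nonlinear layer, but the edges returned by `mp_edges` are no longer the right reference in that case.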

conjugate kernel, kernel and neural tangent kernel, linear-width neural network

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Review for NeurIPS paper: Spectra of the Conjugate Kernel and Neural Tangent Kernel for linear-width neural networks

Neural Information Processing Systems, Jan-24-2025, 14:13:50 GMT

This deserves to be clarified (for instance, by clearly stating which are the actual contributions of the work).

conjugate kernel, kernel and neural tangent kernel, linear-width neural network

Country:

North America > Guadeloupe (0.06)
Europe > Sweden > Stockholm > Stockholm (0.06)
Europe > France (0.06)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)

Review for NeurIPS paper: Spectra of the Conjugate Kernel and Neural Tangent Kernel for linear-width neural networks

Neural Information Processing Systems, Jan-24-2025, 14:13:43 GMT

The reviewers and I are all confident that this paper will be interesting to the NeurIPS community and should be accepted. In addition to the improvements suggested by the reviewers, I would encourage the authors to expand the description of how to unfold the recursion in Theorem 3.7. The discussion in Appendix A helps, but it is insufficient as it is missing crucial details that would clarify how to interpret some of the ambiguous notation. I think including a detailed worked example would be an important addition.

conjugate kernel, kernel and neural tangent kernel, linear-width neural network

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)

Spectral Evolution and Invariance in Linear-width Neural Networks

Neural Information Processing Systems, Oct-11-2024, 15:46:14 GMT

We investigate the spectral properties of linear-width feed-forward neural networks, where the sample size is asymptotically proportional to network width. Empirically, we show that the spectra of the weight matrices in this high-dimensional regime are invariant when trained by gradient descent with small constant learning rates; we provide a theoretical justification for this observation and prove the invariance of the bulk spectra for both conjugate and neural tangent kernels. We demonstrate similar characteristics when training with stochastic gradient descent with small learning rates. When the learning rate is large, we exhibit the emergence of an outlier whose corresponding eigenvector is aligned with the training data structure. We also show that after adaptive gradient training, where a lower test error and feature learning emerge, both weight and kernel matrices exhibit heavy-tailed behavior.

linear-width neural network, spectral evolution and invariance, spectral property

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.86)

Spectra of the Conjugate Kernel and Neural Tangent Kernel for linear-width neural networks

Neural Information Processing Systems, Oct-10-2024, 06:22:19 GMT

We study the eigenvalue distributions of the Conjugate Kernel (CK) and Neural Tangent Kernel (NTK) associated with multi-layer feedforward neural networks. In an asymptotic regime where network width is increasing linearly in sample size, under random initialization of the weights, and for input samples satisfying a notion of approximate pairwise orthogonality, we show that the eigenvalue distributions of the CK and NTK converge to deterministic limits. The limit for the CK is described by iterating the Marcenko-Pastur map across the hidden layers. The limit for the NTK is equivalent to that of a linear combination of the CK matrices across layers, and may be described by recursive fixed-point equations that extend this Marcenko-Pastur map. We demonstrate the agreement of these asymptotic predictions with the observed spectra for both synthetic and CIFAR-10 training data, and we perform a small simulation to investigate the evolution of these spectra over training.

conjugate kernel, kernel and neural tangent kernel, linear-width neural network

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
